An In-place Framework for Exact and Approximate Shortest Unique Substring Queries

Authors

  • Wing-Kai Hon
  • Sharma V. Thankachan
  • Bojian Xu
Abstract

We revisit the exact shortest unique substring (SUS) finding problem and propose an approximate version in which mismatches are allowed, motivated by applications in subfields such as computational biology. We design a generic in-place framework that solves both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words plus n bytes of space, where n is the input string size. Using this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of O(n) and O(n^2) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and is thus practical and easy to implement.
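To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the in-place framework from the abstract and makes no attempt to match its time or space bounds. It assumes the usual reading of the definition: a substring is k-mismatch unique if no other substring of the same length is within Hamming distance k of it, and k = 0 gives the exact case. The function names `hamming`, `is_k_unique`, and `sus_brute_force` are illustrative, and positions are 0-based.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_k_unique(s, i, j, k=0):
    """True if no other substring of the same length is within
    Hamming distance k of s[i..j] (inclusive, 0-based)."""
    length = j - i + 1
    target = s[i:j + 1]
    for start in range(len(s) - length + 1):
        if start != i and hamming(target, s[start:start + length]) <= k:
            return False
    return True


def sus_brute_force(s, p, k=0):
    """Leftmost shortest k-mismatch unique substring covering position p,
    returned as an inclusive (i, j) pair. Assumes 0 <= p < len(s)."""
    n = len(s)
    for length in range(1, n + 1):
        # Only start positions whose window covers p and fits in s.
        for i in range(max(0, p - length + 1), min(p, n - length) + 1):
            j = i + length - 1
            if is_k_unique(s, i, j, k):
                return (i, j)
    return None  # not reached for valid p: the whole string is always unique


print(sus_brute_force("abcab", 2))        # exact SUS covering 'c'
print(sus_brute_force("abcab", 2, k=1))   # 1-mismatch SUS covering 'c'
```

Because the whole string has no other occurrence of its own length, a result always exists for a valid position. The sketch is useful mainly as a reference oracle for testing a faster implementation.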

Similar resources

Shortest Unique Substring Queries on Run-Length Encoded Strings

We consider the problem of answering shortest unique substring (SUS) queries on run-length encoded strings. For a string S, a unique substring u = S[i..j] is said to be a shortest unique substring (SUS) of S containing an interval [s, t] (i ≤ s ≤ t ≤ j) if for any i′ ≤ s ≤ t ≤ j′ with j − i > j′ − i′, S[i′..j′] occurs at least twice in S. Given a run-length encoding of size m of a string of len...

Shortest Unique Queries on Strings

Let D be a long input string of n characters (from an alphabet of size up to 2^w, where w is the number of bits in a machine word). Given a substring q of D, a shortest unique query returns a shortest unique substring of D that contains q. We present an optimal structure that consumes O(n) space, can be built in O(n) time, and answers a query in O(1) time. We also extend our techniques to solve s...

Processing Queries on Road Networks in Spatial Data Base Perspective for Selectivity Estimation

This work focuses on building a framework capable of analyzing spatial approximate substring queries, mainly to solve the selectivity estimation problem for range queries on road networks represented in spatial databases. Selectivity estimation means estimating the size of the result set, i.e., estimating the number of points that are present in a graph whi...

A simple yet time-optimal and linear-space algorithm for shortest unique substring queries

Article history: Received 30 March 2014. Accepted 7 November 2014. Available online 13 November 2014. Communicated by G. Ausiello.

Shortest unique palindromic substring queries in optimal time

A palindrome is a string that reads the same forward and backward. A palindromic substring P of a string S is called a shortest unique palindromic substring (SUPS) for an interval [s, t] in S, if P occurs exactly once in S, this occurrence of P contains interval [s, t], and every palindromic substring of S which contains interval [s, t] and is shorter than P occurs at least twice in S. The SUPS...
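The SUPS definition above can be checked directly with a brute-force sketch. This is only an illustration of the definition, not the optimal-time method the title refers to. The names `count_occurrences` and `sups_brute_force` are hypothetical, indices are 0-based and inclusive, and the function returns only the leftmost shortest answer, or `None` when no unique palindromic substring contains the interval.

```python
def count_occurrences(text, pattern):
    """Count (possibly overlapping) occurrences of pattern in text."""
    m = len(pattern)
    return sum(1 for i in range(len(text) - m + 1) if text[i:i + m] == pattern)


def sups_brute_force(text, s, t):
    """Leftmost shortest unique palindromic substring of text that
    contains the interval [s, t]; returns (i, j) inclusive, or None."""
    n = len(text)
    for length in range(t - s + 1, n + 1):
        # Start positions whose window covers [s, t] and fits in text.
        for i in range(max(0, t - length + 1), min(s, n - length) + 1):
            sub = text[i:i + length]
            if sub == sub[::-1] and count_occurrences(text, sub) == 1:
                return (i, i + length - 1)
    return None


print(sups_brute_force("abacaba", 1, 1))  # shortest unique palindrome covering index 1
```

Unlike the plain SUS problem, an answer need not exist: in "ab", for example, no palindromic substring contains the interval [0, 1].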

Publication date: 2015